

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Neural Information Processing Systems

The emergence of large-scale text-to-image models [92, 60, 59, 58, 42, 5, 94, 14, 54, 40] has significantly advanced the field of Text-to-Video (T2V) generation [66, 6, 7, 21, 73, 90]. Existing T2V architectures can be categorized into two types: U-Net-based and DiT-based. The latter focuses on recreating open-source structures similar to Sora [9], using the DiT (Diffusion Transformer) [57] framework for T2V generation [43, 95, 93, 20]. When calculating the MTScore, the video retrieval model uses these texts to evaluate each frame of the video, assigning probabilities based on the matches. The final result is obtained by summing the general probability and the metamorphic probability.
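The MTScore aggregation described above can be sketched as follows. This is a minimal illustration, assuming a video-retrieval model has already produced, for each frame, a probability that the frame matches a "general" caption and a probability that it matches a "metamorphic" caption; the function and parameter names (`mtscore`, `general_probs`, `metamorphic_probs`) are illustrative and not taken from the paper's code.

```python
def mtscore(general_probs, metamorphic_probs):
    """Aggregate per-frame match probabilities into a single score.

    Averages the per-frame probabilities for each caption type, then
    sums the general and metamorphic averages, as described in the text.
    """
    if len(general_probs) != len(metamorphic_probs):
        raise ValueError("expected one (general, metamorphic) pair per frame")
    n = len(general_probs)
    general = sum(general_probs) / n          # mean general-match probability
    metamorphic = sum(metamorphic_probs) / n  # mean metamorphic-match probability
    return general + metamorphic


# Example: two frames with per-frame match probabilities from a retrieval model.
score = mtscore([0.2, 0.4], [0.6, 0.8])  # (0.3) + (0.7) = 1.0
```

In practice the per-frame probabilities would come from a video-text retrieval model scoring each frame against the general and metamorphic caption sets; this sketch only covers the final aggregation step.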


ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Neural Information Processing Systems

To enable models to learn representation spaces that better simulate the real world, the larger the dataset and the richer the physical knowledge contained in its videos, the better the training effect. Researchers therefore often construct such large-scale datasets through web scraping.




ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Yuan, Shenghai, Huang, Jinfa, Xu, Yongqi, Liu, Yaoyang, Zhang, Shaofeng, Shi, Yujun, Zhu, Ruijie, Cheng, Xinhua, Luo, Jiebo, Yuan, Li

arXiv.org Artificial Intelligence

We propose a novel text-to-video (T2V) generation benchmark, ChronoMagic-Bench, to evaluate the temporal and metamorphic capabilities of T2V models (e.g., Sora and Lumiere) in time-lapse video generation. In contrast to existing benchmarks that focus on the visual quality and textual relevance of generated videos, ChronoMagic-Bench focuses on the model's ability to generate time-lapse videos with significant metamorphic amplitude and temporal coherence. The benchmark probes T2V models for their physics, biology, and chemistry capabilities through free-form text queries. For these purposes, ChronoMagic-Bench introduces 1,649 prompts and real-world videos as references, categorized into four major types of time-lapse videos: biological, human-created, meteorological, and physical phenomena, which are further divided into 75 subcategories. This categorization comprehensively evaluates the model's capacity to handle diverse and complex transformations. To accurately align the benchmark with human preference, we introduce two new automatic metrics, MTScore and CHScore, to evaluate the videos' metamorphic attributes and temporal coherence. MTScore measures the metamorphic amplitude, reflecting the degree of change over time, while CHScore assesses the temporal coherence, ensuring the generated videos maintain logical progression and continuity. Based on ChronoMagic-Bench, we conduct comprehensive manual evaluations of ten representative T2V models, revealing their strengths and weaknesses across different categories of prompts, and providing a thorough evaluation framework that addresses current gaps in video generation research. Moreover, we create a large-scale ChronoMagic-Pro dataset, containing 460k high-quality pairs of 720p time-lapse videos and detailed captions, ensuring high physical pertinence and large metamorphic amplitude.


MIT researchers train AI to predict how humans paint works of art

#artificialintelligence

MIT researchers have created an AI tool capable of generating time-lapse videos that predict how human artists use their hands to create watercolor or digital paintings. The AI is trained using time-lapse videos of people making art on Vimeo and YouTube. The probabilistic model can synthesize and predict moments in the painting process from just a single image of an artwork. The network is meant to mimic the ability skilled human artists possess to see a piece of art and comprehend the series of brush strokes or steps a person took to put it together. There are often many possible ways to create a given painting.


Sydney Startup Uses AI to Improve IVF Success Rate NVIDIA Blog

#artificialintelligence

In vitro fertilization, a common treatment for infertility, is a lengthy undertaking for prospective parents, involving ultrasounds, blood tests and injections of fertility medications. If the process doesn't end up in a successful pregnancy -- which is often the case -- it can be a major emotional and financial blow. Sydney-based healthcare startup Harrison.ai is using deep learning to improve the odds of success for thousands of IVF patients. Its AI model, IVY, is used by Virtus Health, a global provider of assisted reproductive services, to help doctors evaluate which embryo candidate has the best chance of implantation into the patient. Founded by brothers Aengus and Dimitry Tran in 2017, Harrison.ai